Exploration and exploitation balance management in fuzzy reinforcement learning

Authors

Vali Derhami

Vahid Johari Majd

Majid Nili Ahmadabadi

Abstract

This paper offers a fuzzy scheme for managing the balance between exploration and exploitation, which can be implemented in any critic-only fuzzy reinforcement learning method. The paper, however, focuses on a newly developed continuous reinforcement learning method called fuzzy Sarsa learning (FSL), chosen for its advantages. Establishing this balance depends greatly on the accuracy of the action value function approximation. First, the overfitting problem in approximating the action value function in continuous reinforcement learning algorithms is discussed, and a new adaptive learning rate is proposed to prevent it. By relating the learning rate to the inverse of the “fuzzy visit value” of the current state, the training data set is forced to have a uniform effect on the weight parameters of the approximator, and hence overfitting is resolved. Then, a fuzzy balancer is introduced to balance exploration against exploitation by generating a suitable temperature factor for the Softmax formula. Finally, an enhanced FSL (EFSL) is offered by integrating the proposed adaptive learning rate and the fuzzy balancer into FSL. Simulation results show that EFSL eliminates overfitting, manages the balance well, and outperforms FSL in terms of learning speed and action quality. © 2009 Elsevier B.V. All rights reserved.
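The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formula for the "fuzzy visit value" or the rules of the fuzzy balancer, so the function names, the `base_rate` and `min_visit` parameters, and the simple inverse rule below are all assumptions made for illustration. The sketch only shows how a temperature factor shifts Softmax action selection between exploration and exploitation, and how a learning rate can shrink as a state is visited more often.

```python
import math


def softmax_probs(action_values, temperature):
    """Softmax (Boltzmann) action probabilities for a given temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    best = max(action_values)
    exps = [math.exp((q - best) / temperature) for q in action_values]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_learning_rate(visit_value, base_rate=0.5, min_visit=1.0):
    """Hypothetical rule: learning rate inversely related to a visit value.

    `visit_value` stands in for the paper's "fuzzy visit value"; its exact
    definition is not given in the abstract, so this form is an assumption.
    """
    return base_rate / max(visit_value, min_visit)


q_values = [1.0, 2.0, 3.0]
# High temperature: probabilities are nearly uniform (more exploration).
hot = softmax_probs(q_values, temperature=100.0)
# Low temperature: the best action dominates (more exploitation).
cold = softmax_probs(q_values, temperature=0.1)
# A frequently visited state receives a smaller learning rate.
rate_new = adaptive_learning_rate(0.0)
rate_old = adaptive_learning_rate(10.0)
```

In this sketch the temperature is passed in by hand; in the paper it is produced by the fuzzy balancer, whose inputs and rule base are not described in the abstract.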

Similar resources

Exploration and Exploitation Tradeoff using Fuzzy Reinforcement Learning

The difficulty of balancing exploration and exploitation in a multiagent environment is a dilemma without a clear answer, and different methods are still being investigated to address it. In this paper, we provide a method based on fuzzy variables for managing exploration and exploitation in a multiagent environment. In this method, an effective agent ...

How an Adaptive Learning Rate Benefits Neuro-Fuzzy Reinforcement Learning Systems

To acquire adaptive behaviors of multiple agents in an unknown environment, several neuro-fuzzy reinforcement learning systems (NFRLSs) have been proposed (Kuremoto et al.). Meanwhile, to manage the balance between exploration and exploitation in fuzzy reinforcement learning (FRL), an adaptive learning rate (ALR), which adjusts the learning rate by considering the “fuzzy visit value” of the current sta...

Control of exploitation-exploration meta-parameter in reinforcement learning

In reinforcement learning (RL), the duality between exploitation and exploration has long been an important issue. This paper presents a new method that controls the balance between exploitation and exploration. Our learning scheme is based on model-based RL, in which the Bayes inference with forgetting effect estimates the state-transition probability of the environment. The balance parameter,...

A Survey of Exploration Strategies in Reinforcement Learning

A fundamental issue in reinforcement learning algorithms is the balance between exploration of the environment and exploitation of information already obtained by the agent. This paper surveys exploration strategies used in reinforcement learning and summarizes the existing research with respect to their applicability and effectiveness.

Realworld Robot Navigation by Two Dimensional Evaluation Reinforcement Learning

The trade-off between exploration and exploitation is present in any learning method based on trial and error, such as reinforcement learning. We have proposed a reinforcement learning algorithm using reward and punishment as repulsive evaluation (2D-RL). In the algorithm, an appropriate balance between exploration and exploitation can be attained by using interest and utility. In this paper, we appl...

Journal:

Fuzzy Sets and Systems

Volume 161

Publication date: 2010
